Speaker Comparison with Inner Product Discriminant Functions
Abstract
Speaker comparison, the process of finding the speaker similarity between two speech signals, occupies a central role in a variety of applications: speaker verification, clustering, and identification. Speaker comparison can be placed in a geometric framework by casting the problem as a model comparison process. For a given speech signal, feature vectors are produced and used to adapt a Gaussian mixture model (GMM). Speaker comparison can then be viewed as the process of compensating and finding metrics on the space of adapted models. We propose a framework, inner product discriminant functions (IPDFs), which extends many common techniques for speaker comparison, including support vector machines, joint factor analysis, and linear scoring. The framework uses inner products between GMM parameter vectors, with the form of each inner product motivated by several statistical methods. Compensation of nuisances is performed via linear transforms on GMM parameter vectors. Using the IPDF framework, we show that many current techniques are simple variations of each other. We demonstrate, on a 2006 NIST speaker recognition evaluation task, new scoring methods using IPDFs which produce excellent error rates and require significantly less computation than current techniques.
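The two operations named in the abstract, an inner product between GMM parameter vectors and a linear transform that compensates for nuisances, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the parameter vector is taken to be the stacked component means, the inner product is weighted per dimension by mixture weight over variance, and compensation is an orthogonal projection that removes a known set of orthonormal nuisance directions. The function names are hypothetical.

```python
def supervector(means):
    """Stack per-component mean vectors into one flat parameter vector."""
    return [x for mean in means for x in mean]


def per_dimension_weights(mix_weights, variances):
    """Assumed weighting: mixture weight divided by variance, per dimension."""
    return [w / v for w, var in zip(mix_weights, variances) for v in var]


def remove_nuisance(vec, directions):
    """Linear compensation: v - sum_k (u_k . v) u_k.

    `directions` must be orthonormal; this applies (I - U U^T) to vec.
    """
    out = list(vec)
    for u in directions:
        coeff = sum(a * b for a, b in zip(u, vec))
        out = [o - coeff * a for o, a in zip(out, u)]
    return out


def weighted_inner_product(va, vb, weights):
    """Diagonal-weighted inner product: sum_i w_i * a_i * b_i."""
    return sum(w * a * b for w, a, b in zip(weights, va, vb))


# Toy data (hypothetical): two adapted GMMs, 2 components, 2 dimensions each.
sv_a = supervector([[1.0, 1.0], [0.0, 2.0]])
sv_b = supervector([[1.0, 3.0], [0.0, 2.0]])
weights = per_dimension_weights([0.5, 0.5], [[1.0, 1.0], [1.0, 1.0]])
nuisance = [[0.0, 1.0, 0.0, 0.0]]  # assumed nuisance direction

raw_score = weighted_inner_product(sv_a, sv_b, weights)
comp_score = weighted_inner_product(
    remove_nuisance(sv_a, nuisance), remove_nuisance(sv_b, nuisance), weights
)
print(raw_score)   # 4.0
print(comp_score)  # 2.5
```

In this toy case the two models differ only along the assumed nuisance direction, so after projection they map to the same vector and the score reflects only the shared, speaker-relevant coordinates.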
Similar resources
Addressing the Data-Imbalance Problem in Kernel-Based Speaker Verification via Utterance Partitioning and Speaker Comparison
GMM-SVM has become a promising approach to text-independent speaker verification. However, a problematic issue of this approach is the extremely serious imbalance between the numbers of speaker-class and impostor-class utterances available for training the speaker-dependent SVMs. This data-imbalance problem can be addressed by (1) creating more speaker-class supervectors for SVM training through...
Learning the decision function for speaker verification
This paper explores the possibility of replacing the usual thresholding decision rule on log-likelihood ratios used in speaker verification systems with more complex and discriminant decision functions based, for instance, on linear regression models or support vector machines. Current speaker verification systems, based on generative models such as HMMs or GMMs, can indeed easily be adapted to use s...
Discrimination of Speakers Using Tone and Formant Dynamics in Thai
Dynamic properties of speech have been identified as offering greater potential than static features to discriminate between speakers. They may therefore offer useful evidence in forensic speaker comparison analyses. This exploratory study assesses the speaker specificity of diphthong and tone trajectories in Thai. Data were analysed from five male standard Thai speakers. Discriminant analysis ...
i-vector Based Speaker Recognition on Short Utterances
Robust speaker verification on short utterances remains a key consideration when deploying automatic speaker recognition, as many real world applications often have access to only limited duration speech data. This paper explores how the recent technologies focused around total variability modeling behave when training and testing utterance lengths are reduced. Results are presented which provi...
A proposed decision rule for speaker recognition based on fuzzy c-means clustering
In vector quantisation (VQ) based speaker recognition, the minimum overall average distortion rule is used as a criterion to assign a given sequence of acoustic vectors to a speaker model known as a codebook. An alternative decision rule based on fuzzy c-means clustering is proposed in this paper. A set of membership functions associated with vectors for codebooks are defined as discriminant fu...